Abstract

We define and study a family of distributions whose domain is a
complete Riemannian manifold. They are obtained by projection onto a
fixed tangent space via the inverse exponential map. This construction
is a popular choice in the literature because it makes it easy to
generalize well-known multivariate Euclidean distributions. However,
most of the available solutions use coordinate-specific definitions,
which makes them less versatile. We define the distributions of
interest in a coordinate-independent way by utilizing covariant
2-tensors. Then we study the relation of these distributions to their
Euclidean counterparts. In particular, we are interested in relating
the covariance to the tensor that controls the concentration of the
distribution. We find an approximating expression for this relation in
general and give more precise formulas for manifolds of constant
curvature, positive or negative. The results are confirmed by
simulation studies of the standard normal distribution on the unit
sphere and the hyperbolic plane.



1 Introduction 

We are interested in defining and studying some of the properties of
distributions on complete Riemannian manifolds. A typical example of
such a manifold is the unit n-sphere $S^n$. In this sense, the primary,
though not the only, application of our study is directional
statistics, a branch of statistics dealing with directions and
rotations in $\mathbb{R}^n$.

Pioneers in the field are R. A. Fisher [6] and von Mises. In recent
years directional statistics has proved useful in a variety of
disciplines, such as shape analysis [9], geology, crystallography [8],
bioinformatics [11] and data mining.

The best known distribution in directional statistics is the
von Mises-Fisher distribution. It is defined on the unit n-sphere by
the density

$$f_n(x; \mu, \kappa) = C_n(\kappa) \exp(\kappa \mu' x), \quad x \in S^n,$$

where $\kappa > 0$, $\mu \in S^n$ and $C_n(\kappa)$ is a normalizing
constant. It was initially applied to the study of electric fields
(n = 2). Its one-dimensional variant, the von Mises distribution, is
also known as the circular normal distribution.

Another important distribution is the Fisher-Bingham-Kent (FBK)
distribution, proposed by J. Kent in 1982. It is defined on $S^2$ by
the density

$$f(x) = \frac{1}{c(\kappa, \beta)} \exp\{\kappa \gamma_1 \cdot x + \beta[(\gamma_2 \cdot x)^2 - (\gamma_3 \cdot x)^2]\},$$

where $\gamma_1$, $\gamma_2$ and $\gamma_3$ are three orthonormal
directions in $\mathbb{R}^3$. A recent application of the Kent
distribution can be found in [7].

The family of centered distributions we are going to consider includes
the von Mises-Fisher distributions but not the FBK distributions, which
are of mixed nature. Centered distributions are obtained by projecting
the distribution domain onto a fixed vector space, namely a tangent
space of the manifold. This approach is well known and easy to
implement. However, we think that not all of its aspects have been
treated rigorously. One problem that needs care is defining
distributions in a coordinate-free manner. This issue is important
when the domain is a compact Riemannian manifold, such as $S^n$, that
does not admit a global parametrization. Another problem arises in the
study of covariance, which is coordinate-specific in nature. Only those
properties of distributions that are invariant to the coordinate system
are relevant in comparison studies.

Here we do not target a specific application, but rather aim at
generalizing and pedagogically improving on the existing solutions, for
example by providing a coordinate-free definition of a large class of
distributions on complete manifolds.

Another direction of this study is the impact of domain curvature on
the covariance of the distributions of interest. Again, we improve upon
some existing results [12] by generalizing them and making them more
precise. Finally, we provide simulation results, which to our knowledge
are missing in the literature, that illustrate and confirm the formal
developments on specific spaces of constant curvature, the unit
2-sphere and the hyperbolic plane.

2 Definition of centered distributions 

Let $M$ be a Riemannian n-manifold, $q \in M$, and let $\mathrm{Exp}_q$
be the exponential map at $q$, $\mathrm{Exp}_q : T_qM \to M$. If $M$ is
complete, then $\mathrm{Exp}_q$ is defined on the whole tangent space
$T_qM$. Throughout this paper we assume that $M$ is a complete
Riemannian n-manifold.

There is a maximal open set $\mathcal{B}(q)$ in $T_qM$ containing the
origin on which $\mathrm{Exp}_q$ is a diffeomorphism. The set
$B(q) = \mathrm{Exp}_q(\mathcal{B}(q))$ is called the maximal normal
neighborhood of $q$. On this normal neighborhood the exponential map is
invertible; let

$$\mathrm{Log}_q = \mathrm{Exp}_q^{-1} : B(q) \to T_qM$$

be its inverse, the so-called log-map. $\mathrm{Log}_q$ is a
diffeomorphism on $B(q)$.

The Borel sets on $M$, generated by the open sets of $M$, form a
$\sigma$-algebra $\mathcal{A}$ on $M$. Any Riemannian manifold has a
natural measure $V$ on $\mathcal{A}$, called the volume measure. In
local coordinates $x$ it is given by

$$dV(x) = \sqrt{|G(x)|}\,dx,$$

where $G(x)$ is the matrix representation of the metric tensor, $|G|$
is its determinant and $dx$ is the Lebesgue measure in $\mathbb{R}^n$.
More details can be found in [1], ch. 3.3.
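
As a concrete illustration of the volume formula, the following minimal sketch takes the unit 2-sphere in spherical coordinates $(\theta, \phi)$. This chart is an assumption made only for the example and is not used elsewhere in the paper. There $G = \mathrm{diag}(1, \sin^2\theta)$, so $\sqrt{|G|} = \sin\theta$, and integrating the density numerically should recover the total area $4\pi$. The function names are illustrative.

```python
import math

def sqrt_det_metric(theta: float) -> float:
    """sqrt(|G|) for the unit 2-sphere in spherical coordinates (theta, phi).

    The metric matrix is G = diag(1, sin(theta)^2).
    """
    g = [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return math.sqrt(det)

def sphere_area(n_theta: int = 2000) -> float:
    """Midpoint-rule integral of dV = sqrt(|G|) dtheta dphi over the chart."""
    h = math.pi / n_theta
    # The integrand does not depend on phi, so the phi-integral is a factor 2*pi.
    theta_integral = sum(sqrt_det_metric((i + 0.5) * h) for i in range(n_theta)) * h
    return 2.0 * math.pi * theta_integral

area = sphere_area()
```

The single chart misses only a set of measure zero, which is why the integral still returns the full area.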

We consider a family $\mathcal{Q}$ of distributions on $M$ given by
density differentials

$$dQ(p; q, T, f) = k f(T(\mathrm{Log}_q p, \mathrm{Log}_q p))\,dV(p), \qquad (1)$$

where $q \in M$, $T$ is a symmetric and positive definite covariant
2-tensor (bilinear form) on the tangent space $T_qM$,
$f : \mathbb{R} \to \mathbb{R}^+$ is a function on $\mathbb{R}$ and $k$
is a normalizing constant. We call the elements of $\mathcal{Q}$
centered distributions because their densities are defined via
projection onto a single tangent space ($T_qM$) placed at a central
point ($q$). Note that their intrinsic means may or may not coincide
with $q$. Also, as defined, the distributions from $\mathcal{Q}$ are
absolutely continuous with respect to the volume measure, with
$k f(T(\cdot, \cdot))$ being their densities.

A particular member of the family $\mathcal{Q}$ takes

$$T(X, Y) = \langle X, Y \rangle, \quad X, Y \in T_qM,$$

$$f(t) = \exp(-\tfrac{1}{2} t),$$

and defines the so-called standard normal distribution on $M$ at $q$.

Sometimes we want the log-map to be surjective onto the entire tangent
space at $q$, except possibly for a subset of measure zero. With the
current definition of the log-map we have
$\mathrm{Log}_q(B(q)) = \mathcal{B}(q)$, and when the cut locus
$\mathrm{Cut}(q)$ of $q$ is non-empty, $\mathcal{B}(q)$ is a bounded
star-like neighborhood in $T_qM$. Can we extend the definition of
$\mathrm{Log}_q$ so that the image of $B(q)$ is as large as possible,
namely $T_qM$? We are going to introduce a multi-valued version of the
log-map designed to meet this requirement.

The set of critical points of $\mathrm{Exp}_q$, i.e. the set where
$\mathrm{Exp}_q$ is not a diffeomorphism, is closed and has volume
measure zero (for more details see [1], Th. 3.2 and Prop. 3.1). In
fact, the set of non-critical points of $\mathrm{Exp}_q$ is exactly
$B(q)$, the maximal normal neighborhood of $q$. Thus, $B(q)$ is open in
$M$ and $V(M \setminus B(q)) = 0$. For any $p \in B(q)$, there exists a
neighborhood $V$ of $p$ such that $V \subset B(q)$. Since
$W = \mathrm{Exp}_q^{-1}(V)$ is open in $T_qM$, which has a countable
basis, $W$ has countably many connected components,
$W = \cup_{i \ge 1} W_i$. Moreover, each connected component $W_i$ of
$W$ is mapped diffeomorphically onto $V$ by $\mathrm{Exp}_q$.
Therefore, if we consider $B(q)$ to be a submanifold of $M$, then the
map

$$\mathrm{Exp}_q : \mathrm{Exp}_q^{-1}(B(q)) \to B(q)$$

is a covering of $B(q)$. In fact, we can take $V = B(q)$ and then

$$\mathrm{Exp}_q|_{W_i} : W_i \to B(q)$$

are diffeomorphisms.

Define $\mathrm{Log}_q|_{W_i}(p) = v_i$ for the unique $v_i \in W_i$
such that $\mathrm{Exp}_q(v_i) = p$. We call the diffeomorphisms
$\mathrm{Log}_q|_{W_i} : B(q) \to W_i$ the leaves of the log-map. The
multi-valued version $\overline{\mathrm{Log}}_q$ of $\mathrm{Log}_q$ is
defined on the entire $B(q)$ by

$$\overline{\mathrm{Log}}_q p = \{\mathrm{Log}_q|_{W_i}(p)\}_{i \ge 1}, \quad p \in B(q). \qquad (2)$$

We define

$$f(T(\overline{\mathrm{Log}}_q p, \overline{\mathrm{Log}}_q p)) = \sum_{i=1}^{\infty} f(T(\mathrm{Log}_q|_{W_i}(p), \mathrm{Log}_q|_{W_i}(p))), \qquad (3)$$

and then the distribution form (1) has to be read as

$$dQ(p; q, T, f) = k f(T(\overline{\mathrm{Log}}_q p, \overline{\mathrm{Log}}_q p))\,dV(p) = k \sum_{i=1}^{\infty} f(T(\mathrm{Log}_q|_{W_i}(p), \mathrm{Log}_q|_{W_i}(p)))\,dV(p).$$

We refer to the operation (3) as folding a density. Basically, the
support of $f$ determines how many leaves of the log-map we use.

Recall that if $(x_1, \dots, x_n)$ is an orthonormal basis of $T_qM$,
the normal coordinate system on $W_i$ is given by

$$v = (v^1, \dots, v^n) \mapsto \phi(v) = \mathrm{Exp}_q(v^1 x_1 + \dots + v^n x_n), \quad v^1 x_1 + \dots + v^n x_n \in W_i,$$

and the log-map is particularly simple,
$\mathrm{Log}_q|_{W_i}(\phi(v)) = v$.

The benefit of introducing (2) and (3) is clear when one integrates
functions of the log-map. For example, the expectation with respect to
$Q \in \mathcal{Q}$ of any measurable function $h(p)$ on $M$ is

$$E_Q h = k \int_M h(p) f(T(\overline{\mathrm{Log}}_q p, \overline{\mathrm{Log}}_q p))\,dV(p) = k \int_{\mathbb{R}^n} h(\mathrm{Exp}_q v) f(T(v, v)) \sqrt{|G(v)|}\,dv,$$

where the last integral is the Lebesgue one on the whole
$\mathbb{R}^n$. Using the multi-valued log-map, all density functions
$f$ on $\mathbb{R}^n$, like the normal ones, can be manipulated more
easily on a general manifold $M$, because we do not change their
support.

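Example 1 Let $M$ be the unit n-sphere $S^n$. Fix a point $q \in S^n$.
The cut locus of $q$ is the antipodal point $-q$. Thus,
$B(q) = S^n \setminus \{-q\}$. Define $U_k = B_{k\pi}(q)$, the ball in
$T_qS^n$ with radius $k\pi$, $k \ge 1$. In $T_qS^n$, the maximal normal
neighborhood of $q$ corresponds to $U_1$. We have
$\mathrm{Exp}_q^{-1}(B(q)) = \cup_i W_i$ for
$W_i = B_{(i+1)\pi}(q) \setminus B_{i\pi}(q)$. Let
$n_{qp} = \mathrm{Log}_q p / \|\mathrm{Log}_q p\|$ be the unit tangent
vector at $q$ in the direction of $p$; then
$\mathrm{Log}_q|_{B_\pi}(p) = d(q, p)\, n_{qp}$ for

$$d(q, p) = \cos^{-1}\langle q, p \rangle \in [0, \pi]$$

and

$$\overline{\mathrm{Log}}_q p = \{(d(q, p) \pm 2\pi i)\, n_{qp}\}_{i \ge 0}.$$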
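
The sphere example can be sketched in code. The following is a minimal illustration, assuming the sphere is embedded in Euclidean space as unit vectors and that $p \ne \pm q$. The helper names (`exp_map`, `log_map`, `log_leaves`) and the truncation parameter `max_i` are hypothetical. The sketch computes the principal log-map, lists the first few leaves $(d(q,p) \pm 2\pi i)\, n_{qp}$, and maps each leaf back with the exponential map.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exp_map(q, v):
    """Exponential map on the unit sphere: follow the great circle from q along v."""
    r = math.sqrt(dot(v, v))
    if r == 0.0:
        return list(q)
    return [math.cos(r) * qj + math.sin(r) * vj / r for qj, vj in zip(q, v)]

def log_map(q, p):
    """Principal leaf of the log-map, d(q, p) * n_qp, for p outside the cut locus."""
    c = max(-1.0, min(1.0, dot(q, p)))
    d = math.acos(c)                               # geodesic distance in [0, pi]
    u = [pj - c * qj for pj, qj in zip(p, q)]      # part of p orthogonal to q
    norm_u = math.sqrt(dot(u, u))
    if norm_u == 0.0:
        if c > 0.0:
            return [0.0] * len(q)                  # p == q
        raise ValueError("p = -q lies in the cut locus of q")
    return [d * uj / norm_u for uj in u]

def log_leaves(q, p, max_i=2):
    """Leaves (d +/- 2*pi*i) * n_qp of the multi-valued log-map, i = 0..max_i."""
    v = log_map(q, p)
    d = math.sqrt(dot(v, v))
    if d == 0.0:
        raise ValueError("the direction n_qp is undefined for p == q")
    n = [vj / d for vj in v]
    leaves = []
    for i in range(max_i + 1):
        signs = (1,) if i == 0 else (1, -1)
        for s in signs:
            leaves.append([(d + s * 2.0 * math.pi * i) * nj for nj in n])
    return leaves

q = [0.0, 1.0, 0.0]
p = [math.sin(0.7), math.cos(0.7), 0.0]
leaves = log_leaves(q, p)
images = [exp_map(q, v) for v in leaves]   # every leaf should map back to p
```

Each tangent vector in `leaves` is a different preimage of the same point `p`, which is the property the folding operation relies on.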






Remark 1 In a sense, the proposed extension of the log-map, with the
corresponding modified distributions (3), is a generalization of the
concept of wrapped distributions. These are densities $f$ on the line,
'wrapped' around the circumference of the unit circle $S^1$,

$$f_w(\theta) = \sum_{i=-\infty}^{\infty} f(\theta + 2\pi i), \quad \theta \in [0, 2\pi),$$

as used in directional statistics.
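
The wrapping sum can be checked numerically. The sketch below is an illustration only: it assumes a zero-mean normal density on the line with an arbitrary scale `sigma`, truncates the infinite sum at a hypothetical cutoff `terms`, and verifies that the wrapped density integrates to approximately one over the circle.

```python
import math

def normal_pdf(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def wrapped_pdf(theta, sigma, terms=20):
    """Truncated wrapping sum: sum over i = -terms..terms of f(theta + 2*pi*i)."""
    return sum(normal_pdf(theta + 2.0 * math.pi * i, sigma)
               for i in range(-terms, terms + 1))

def integrate_over_circle(sigma, n=4000):
    """Midpoint rule for the wrapped density over [0, 2*pi)."""
    h = 2.0 * math.pi / n
    return sum(wrapped_pdf((j + 0.5) * h, sigma) for j in range(n)) * h

total = integrate_over_circle(sigma=1.5)
```

No mass is lost by wrapping: the support of the density on the line is kept and only its domain is folded onto the circle.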



Example 2 The von Mises-Fisher distribution is a centered distribution
of form (1) if we take $q = \mu$, $T(v, v) = v'v$,
$f(t) = c_0 \exp(\kappa \cos(t))$, $t \in [0, 2\pi]$, and a normalizing
constant $c_0$. The support of $f$ is bounded and we use only the first
leaf of the log-map.

Example 3 A Gamma distribution on $M$ can be defined by

$$f(t) = c_0 t^{k-1} \exp(-t/\theta), \quad t > 0,$$

for $\theta > 0$ and $T(v, v) = v'v$. The constant $c_0$ is determined
by

$$c_0^{-1} = \int |v|^{k-1} \exp(-|v|/\theta)\,dv,$$

which is evaluated using the fact that the area of $S^n$ is
$\frac{2\pi^{(n+1)/2}}{\Gamma((n+1)/2)}$. Because the support of $f$ is
the whole $\mathbb{R}^+$, we have a folded density.

Unfortunately, neither the von Mises-Fisher nor the multivariate Gamma
distribution has an explicit expression for its second moments, which
makes them less useful in the context of the following results.



3 Approximating the covariance 

Let $Q$ be a distribution from $\mathcal{Q}$. We call covariance of $Q$
the contravariant 2-tensor on the tangent space $T_qM$ given by

$$\Sigma = k \int_M (\mathrm{Log}_q p)(\mathrm{Log}_q p)' f(T(\mathrm{Log}_q p, \mathrm{Log}_q p))\,dV(p).$$

Note that when $q$ is the (intrinsic) mean of $Q$, $\Sigma$ is a
covariance in the usual sense, but here we do not require $q$ to be a
mean and we use the term covariance in a different sense, namely, as a
quantity measuring the dispersion about the center $q$ of $Q$.

We want to obtain an approximating expression for the covariance of $Q$
as a function of the tensor $T$ and the first few moments of $f$.

In normal coordinates $v$ at $q$, the volume measure can be
approximated by

$$dV(v) = [1 - \tfrac{1}{6} v'(Ric)v + O(|v|^3)]\,dv, \qquad (4)$$

where $Ric$ is the matrix representation of the Ricci tensor (see for
example Th. 2.17 in Chavel [1]).

X. Pennec [12] used equation (4) to approximate the covariance of the
normal distribution. His approximation is
$\Sigma \approx T^{-1} - \tfrac{1}{3} T^{-1}(Ric)T^{-1}$.

We use the above approximation of the volume form to obtain a more
general result that applies to the densities of centered distributions
given by (1). In addition, we derive a more precise variance estimate
on the unit 2-sphere and the hyperbolic plane. Finally, we provide some
simulation results to confirm the formulas.

Let $T$ also denote the matrix representation of the tensor $T$ with
respect to the coordinates $v$. Let $T^{-1} = U \Lambda U'$ be the
eigenvalue decomposition of $T^{-1}$, with $\Lambda$ the diagonal
matrix of eigenvalues. Define $S = U \Lambda^{1/2}$. Then
$T^{-1} = SS'$. The determinant of $S$ is $|S| = |T|^{-1/2}$ and its
norm is $\|S\| = \sup\{\|Sx\|_2 : \|x\|_2 = 1\}$. $\|S\|^2$ is the
maximal eigenvalue of $S'S = \Lambda$, which is strictly positive.
Moreover,
$\|S\| \le \|U\|\,\|\Lambda^{1/2}\| \le \|T^{-1}\|^{1/2} = \lambda_{min}^{-1/2}$,
where $\lambda_{min}$ is the minimal eigenvalue of $T$.

We change the variables $v$ to $w = (w_i)$ according to

$$v = Sw.$$

Then $v'Tv = w'w$ and $vv' = S(ww')S'$. The density $f$ is assumed to
satisfy

$$\int f(w'w)\,dw = 1, \quad \int w f(w'w)\,dw = 0, \qquad (5)$$

and we let

$$\int ww' f(w'w)\,dw = C, \quad \int [(ww') \otimes (ww')] f(w'w)\,dw = D. \qquad (6)$$

$C$ is a symmetric and positive definite $n \times n$ matrix, while $D$
is the expectation of the Kronecker product $(ww') \otimes (ww')$ and
thus is an $n^2 \times n^2$ matrix. Let $D = \{D^{kl}_{ij}\}_{klij}$,
and for every $k, l \in \{1, \dots, n\}$ let $D^{kl}$ be the
corresponding $n \times n$ matrix. Let $R = S'(Ric)S = (r_{ij})$. By
$tr(RD)$ we understand the $n \times n$ matrix with elements
$[tr(RD)]_{ij} = \sum_{kl} r_{kl} D^{kl}_{ij}$.

Now we are ready to formulate the following

Lemma 1 Under the assumptions (5) and (6), the density form (1) has
normalizing constant

$$k^{-1} = |S|(1 - \tfrac{1}{6} tr(RC) + \epsilon) \qquad (7)$$

and covariance

$$k^{-1}\Sigma = |S|\,S(C - \tfrac{1}{6} tr(RD) + \epsilon I_n)S', \qquad (8)$$

where the function $\epsilon(S) = O(\|S\|^3)$.

The proof is a straightforward derivation. First observe that by
definition

$$k^{-1} = \int_{\mathbb{R}^n} f(v'Tv)\,dV(v) \qquad (9)$$

and

$$k^{-1}\Sigma = \int_{\mathbb{R}^n} vv' f(v'Tv)\,dV(v), \qquad (10)$$

assuming $\overline{\mathrm{Log}}_q B(q) = \mathbb{R}^n$, which we can
always guarantee by folding, if necessary, the original density $f$
(see definition (3)). In the rest of this section all integrals are
taken over $\mathbb{R}^n$.

We proceed by expressing the terms that appear above when the volume
form is replaced by approximation (4). Obviously,
$\int f(v'Tv)\,dv = |S|$, and then

$$\int v'(Ric)v\, f(v'Tv)\,dv = \int tr(w'Rw) f(w'w)|S|\,dw = |S|\,tr(RC). \qquad (11)$$

Similarly,

$$\int vv' f(v'Tv)\,dv = S\Big(\int ww' f(w'w)|S|\,dw\Big)S' = |S|\,SCS'. \qquad (12)$$

Then we derive

$$\int vv'(v'(Ric)v) f(v'Tv)\,dv = |S|\,S\Big(\int ww'(w'Rw) f(w'w)\,dw\Big)S',$$

with the $(ij)$-th element of the last integral equal to

$$\int w_i w_j (w'Rw) f(w'w)\,dw = \int w_i w_j \Big(\sum_{k,l} w_k r_{kl} w_l\Big) f(w'w)\,dw = \sum_{k,l} r_{kl} \int w_i w_j w_k w_l f(w'w)\,dw = [tr(RD)]_{ij}.$$

Thus,

$$\int vv'(v'(Ric)v) f(v'Tv)\,dv = |S|\,S\,tr(RD)\,S'. \qquad (13)$$

Finally, for the error term we have

$$\int |v|^3 f(v'Tv)\,dv \le \int \|S\|^3 |w|^3 f(w'w)|S|\,dw \le \|S\|^{3+n} \int |w|^3 f(w'w)\,dw,$$

using the fact that $|S| \le \|S\|^n$. Since, given the assumptions we
made, the last integral is bounded, we have

$$\int |v|^3 f(v'Tv)\,dv = |S|\,O(\|S\|^3). \qquad (14)$$

Plugging (11), (12), (13) and (14) into (9) and (10), one obtains the
claim. □

Formulas (9) and (10) are given with respect to normal coordinates $v$,
which are not unique. We will show how they change with a change of
coordinates and what is invariant under such a change.

Let $\tilde{v}$ be another normal coordinate system at $q$ and let the
matrix $A$ be the Jacobian of the change from $v$ to $\tilde{v}$, i.e.
$\tilde{v} = Av$. $A$ is an orthogonal matrix, $A \in O(n)$.

Since $T$ is a symmetric positive definite covariant 2-tensor,
$T^{-1}$ is a contravariant 2-tensor, and so is $\Sigma$. Under the
coordinate change we have

$$T^{-1} \to AT^{-1}A', \quad S \to AS, \quad \Sigma \to A\Sigma A'.$$

The matrices $C$ and $D$ remain unchanged and so does $R = S'(Ric)S$,
because $Ric$ is a covariant tensor such that
$Ric \to (A^{-1})'(Ric)A^{-1}$. Moreover,

$$S^{-1}\Sigma(S^{-1})' \to S^{-1}A^{-1}A\Sigma A'(A^{-1})'(S^{-1})' = S^{-1}\Sigma(S^{-1})',$$

and hence the above quantity is also invariant to the coordinate
system. We have shown the following

Lemma 2 The matrix $S^{-1}\Sigma(S^{-1})'$ is invariant to the normal
coordinate system at $q$ and satisfies

$$S^{-1}\Sigma(S^{-1})' = \frac{C - \tfrac{1}{6} tr(RD) + \epsilon I_n}{1 - \tfrac{1}{6} tr(RC) + \epsilon}, \qquad (15)$$

where $\epsilon(T) = O(\|T^{-1}\|^{3/2})$.

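Example 4 We take a normal distribution on $M$, defined by

$$f(v) = (2\pi)^{-n/2} \exp(-\tfrac{1}{2} v'Tv)$$

for a covariant tensor $T$. Since $\int w_i^2 f(w'w)\,dw = 1$,
$\int w_i^4 f(w'w)\,dw = 3$ and
$[tr(RD)]_{ij} = r_{ij} + r_{ji} + n\delta_{ij} r_{ij}$, we have

$$C = I_n, \quad tr(RC) = tr(R), \quad tr(RD) = 2R + n\,diag(R).$$

Moreover,

$$SCS' = T^{-1}, \quad SRS' = T^{-1}(Ric)T^{-1},$$

and the lemma claims that

$$\Sigma \approx \frac{T^{-1} - \tfrac{1}{3} T^{-1}(Ric)T^{-1} - \tfrac{n}{6} S\,diag(S'(Ric)S)\,S'}{1 - \tfrac{1}{6} tr(T^{-1}Ric)},$$

which is different from the approximation
$\Sigma \approx T^{-1} - \tfrac{1}{3} T^{-1}(Ric)T^{-1}$ given in [12].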



4 Standard normal distribution on the unit 
sphere 

The folded normal distribution on the sphere $S^n$ is given by

$$dQ(p) = k(2\pi)^{-n/2} \exp(-\tfrac{1}{2} T(\overline{\mathrm{Log}}_q p, \overline{\mathrm{Log}}_q p))\,dV(p), \qquad (16)$$

with the following extended expression

$$dQ(p) = k(2\pi)^{-n/2} \sum_{i=0}^{\infty} \exp\Big(-\tfrac{1}{2}\Big(1 \pm \frac{2\pi i}{\|\mathrm{Log}_q p\|}\Big)^2 T(\mathrm{Log}_q p, \mathrm{Log}_q p)\Big)\,dV(p).$$

Above we sum two terms for each $i$; this is what $\pm$ stands for.

In particular, if we assume that in normal coordinates $v$,
$T = \sigma^{-2} I_n$, then the Euclidean standard normal density
$k(2\pi)^{-n/2} \exp(-\tfrac{1}{2} v'Tv)\,dv$ has covariance $C = I_n$
and kurtosis matrix $D = \{D^{kl}_{ij}\}$ such that
$D^{kl}_{kl} = D^{kl}_{lk} = 1$ for $k \ne l$, $D^{kk}_{ii} = 1$ for
$k \ne i$, $D^{kk}_{kk} = 3$, and zero otherwise.

For this particular $T$, the density (16) is

$$dQ(p) = k(2\pi)^{-n/2} \sum_{i=0}^{\infty} \exp\Big(-\frac{1}{2\sigma^2}(\|\mathrm{Log}_q p\| \pm 2\pi i)^2\Big)\,dV(p). \qquad (17)$$

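On the sphere, the Ricci tensor matrix is $Ric = I_n$, and since
$tr(RC) = n\sigma^2$ and $tr(RD) = (n+2)\sigma^2 I_n$ we can simplify
(7) and (8) to

$$k^{-1} = \sigma^n[1 - \tfrac{n}{6}\sigma^2 + O(\sigma^3)]$$

and

$$k^{-1}\Sigma = \sigma^n[1 - \tfrac{n+2}{6}\sigma^2 + O(\sigma^3)]\,\sigma^2 I_n.$$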
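
The value of $tr(RD)$ used here can be reproduced mechanically. The following sketch is an illustration under stated assumptions: it fills the kurtosis array with the nonzero entries listed for the standard normal case, takes $R = \sigma^2 I_n$ (that is, $Ric = I_n$ and $T = \sigma^{-2} I_n$), and evaluates $[tr(RD)]_{ij} = \sum_{kl} r_{kl} D^{kl}_{ij}$. The function names and the choice $n = 3$, $\sigma = 0.5$ are hypothetical.

```python
def kurtosis_tensor(n):
    """D[k][l][i][j] = D^{kl}_{ij} for a standard normal vector in R^n,
    filled in from the nonzero entries listed in the text."""
    D = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for l in range(n):
            if k != l:
                D[k][l][k][l] = 1.0
                D[k][l][l][k] = 1.0
            else:
                for i in range(n):
                    D[k][k][i][i] = 3.0 if i == k else 1.0
    return D

def tr_RD(R, D):
    """n x n matrix with entries sum over k, l of r_kl * D^{kl}_{ij}."""
    n = len(R)
    return [[sum(R[k][l] * D[k][l][i][j] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

n, sigma = 3, 0.5
# R = S' Ric S with Ric = I_n and S = sigma * I_n.
R = [[sigma ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
M = tr_RD(R, kurtosis_tensor(n))   # expected to equal (n + 2) * sigma^2 * I_n
```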



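For n = 2, we write

$$\Sigma \approx \frac{1 - \tfrac{2}{3}\sigma^2}{1 - \tfrac{1}{3}\sigma^2}\,\sigma^2 I_2. \qquad (18)$$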



We can benefit from a better approximation of the volume form and
derive a more precise estimate than (18). The volume form of $S^n$ in
normal coordinates $v$ (see for example 2.3 in [1]) is

$$dV(v) = \frac{\sin\|v\|}{\|v\|}\,dv,$$

with Taylor expansion

$$dV(v) = [1 - \tfrac{1}{6}\|v\|^2 + \tfrac{1}{120}\|v\|^4 + O(\|v\|^6)]\,dv. \qquad (19)$$
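
The order of the remainder in the Taylor expansion can be checked numerically. This small sketch, with illustrative function names and arbitrarily chosen radii, compares $\sin r / r$ with its fourth-order polynomial and looks at how the error shrinks when $r$ is halved. For a remainder of order $r^6$ the error should drop by a factor close to $2^6 = 64$.

```python
import math

def volume_density(r):
    """sin(r)/r, the density of dV with respect to dv at radius r = ||v||."""
    return math.sin(r) / r if r != 0.0 else 1.0

def taylor_density(r):
    """Fourth-order Taylor polynomial of sin(r)/r."""
    return 1.0 - r ** 2 / 6.0 + r ** 4 / 120.0

radii = (0.4, 0.2, 0.1)
errors = [abs(volume_density(r) - taylor_density(r)) for r in radii]
# Ratios of successive errors when the radius is halved.
ratios = (errors[0] / errors[1], errors[1] / errors[2])
```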



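Utilizing the equations

$$\int (vv') \exp\Big(-\frac{1}{2\sigma^2} v'v\Big)\,dv = \sigma^{n+2}(2\pi)^{n/2} I_n, \qquad \text{(i)}$$

$$\int (vv')(v'v) \exp\Big(-\frac{1}{2\sigma^2} v'v\Big)\,dv = (n+2)\sigma^{n+4}(2\pi)^{n/2} I_n, \qquad \text{(ii)}$$

$$\int (v'v) \exp\Big(-\frac{1}{2\sigma^2} v'v\Big)\,dv = n\sigma^{n+2}(2\pi)^{n/2}, \qquad \text{(iii)}$$

$$\int (vv')(v'v)^2 \exp\Big(-\frac{1}{2\sigma^2} v'v\Big)\,dv = (n^2 + 3n + 11)\sigma^{n+6}(2\pi)^{n/2} I_n, \qquad \text{(iv)}$$

$$\int (v'v)^2 \exp\Big(-\frac{1}{2\sigma^2} v'v\Big)\,dv = n(n+2)\sigma^{n+4}(2\pi)^{n/2}, \qquad \text{(v)}$$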



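one can show the following

Lemma 3 The standard normal density on $S^n$ given by (17) has

$$k^{-1} = (2\pi)^{n/2}\sigma^n[1 - \tfrac{n}{6}\sigma^2 + \tfrac{n(n+2)}{120}\sigma^4 + O(\sigma^6)]$$

and

$$k^{-1}\Sigma = (2\pi)^{n/2}\sigma^{n+2}[1 - \tfrac{n+2}{6}\sigma^2 + \tfrac{n^2+3n+11}{120}\sigma^4 + O(\sigma^6)]\,I_n.$$

In particular, for n = 2,

$$\Sigma \approx \frac{1 - \tfrac{2}{3}\sigma^2 + \tfrac{7}{40}\sigma^4}{1 - \tfrac{1}{3}\sigma^2 + \tfrac{1}{15}\sigma^4}\,\sigma^2 I_2, \qquad (20)$$

and we expect $tr(\Sigma)$ to underestimate $\sigma^2$.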

We confirm this conclusion by simulation studies. Figure 1 shows the
results of our experiment. Let $(x, y, z)$ be the Cartesian coordinates
in $\mathbb{R}^3$. We generate samples from a normal distribution with
mean $q = (0, 1, 0)$ and $T = \sigma^{-2} I_2$ for different values of
$\sigma$, shown in blue. For every value of $\sigma$, 100 samples are
drawn to estimate the covariance $\Sigma$. The green curve shows the
prediction according to (20). The red one shows
$\hat{\sigma}^2 = tr\,\hat{\Sigma}$. As we see, for n = 2 and
$tr(T^{-1}) < 1$, $\hat{\sigma}^2$ stays close to the predicted value
(20).
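
An experiment of this kind can be sketched as follows. This is not the authors' MATLAB program but a minimal stand-in under explicit assumptions: samples are drawn directly in normal coordinates on $T_qS^2$ from a density proportional to $\exp(-\|v\|^2 / (2\sigma^2))\,|\sin\|v\||/\|v\|$ by rejection from a Euclidean normal proposal, the estimate of $\sigma^2$ is taken to be half the trace of the sample covariance about $q$, and the prediction is the first-order ratio of the two expansions of $k^{-1}\Sigma$ and $k^{-1}$ at n = 2. The sample size, seed and function names are hypothetical.

```python
import math
import random

def sample_tangent(sigma, rng):
    """Draw v in T_q S^2 with density proportional to
    exp(-|v|^2 / (2 sigma^2)) * |sin|v|| / |v|  (assumed folded normal).

    Rejection sampling: propose v ~ N(0, sigma^2 I_2) and accept with
    probability |sin r| / r <= 1, where r = |v|.
    """
    while True:
        v1, v2 = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        r = math.hypot(v1, v2)
        if r == 0.0 or rng.random() < abs(math.sin(r)) / r:
            return v1, v2

def estimate_sigma2(sigma, n_samples, seed=0):
    """Half the trace of the sample covariance about the center q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        v1, v2 = sample_tangent(sigma, rng)
        total += v1 * v1 + v2 * v2
    return total / (2.0 * n_samples)

def first_order_prediction(sigma):
    """sigma^2 * (1 - 2 sigma^2 / 3) / (1 - sigma^2 / 3)."""
    s2 = sigma ** 2
    return s2 * (1.0 - 2.0 * s2 / 3.0) / (1.0 - s2 / 3.0)

sigma = 0.3
estimate = estimate_sigma2(sigma, n_samples=100_000)
prediction = first_order_prediction(sigma)
```

With a large sample the estimate should fall slightly below $\sigma^2$, in line with the expected underestimation on the sphere.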

Figure 1: Estimation of $\sigma^2$ for the normal distribution on $S^2$
with $T = \sigma^{-2} I_2$. Values of $\sigma$ are in blue and decrease
from 1 to 0.01 in the left figure and from $\sqrt{2}$ to $\sqrt{2}/100$
in the right one. Green curves correspond to the prediction function
given by equation (20). Red curves show the estimates calculated using
150 samples for each $\sigma$.



5 Normal distribution on hyperbolic spaces 

The hyperbolic space $\mathbb{H}^n$ is a Riemannian n-manifold, defined
as the half-space $\{(x_1, \dots, x_n) : x_n > 0\}$ of $\mathbb{R}^n$
endowed with the metric represented by

$$g_{ij}(x) = \frac{\delta_{ij}}{x_n^2}.$$

$\mathbb{H}^n$ is geodesically complete and for any point
$q \in \mathbb{H}^n$, the exponential map at $q$,
$\mathrm{Exp}_q : \mathbb{R}^n \to \mathbb{H}^n$, is a diffeomorphism on
the whole tangent space. Thus, the cut locus, $\mathrm{Cut}(q)$, is
empty. It is said that $\mathbb{H}^n$ is a manifold with a pole.
A normal distribution on $\mathbb{H}^n$ is given by

$$dQ(p) = k(2\pi)^{-n/2} \exp(-\tfrac{1}{2} T(\mathrm{Log}_q p, \mathrm{Log}_q p))\,dV(p). \qquad (21)$$

In particular, if we assume that in normal coordinates $v$,
$T = \sigma^{-2} I_n$, then

$$dQ(v) = k(2\pi)^{-n/2} \exp\Big(-\frac{1}{2\sigma^2}\|v\|^2\Big)\,dV(v).$$

The hyperbolic plane has constant curvature -1 and the Ricci tensor
matrix is $Ric = -I_n$ (for details see [3], ch. 8.3).
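
In the two-dimensional case, n = 2, we can simplify (7) and (8) to

$$k^{-1} = \sigma^2[1 + \tfrac{1}{3}\sigma^2 + O(\sigma^3)]$$

and

$$k^{-1}\Sigma = \sigma^2[1 + \tfrac{2}{3}\sigma^2 + O(\sigma^3)]\,\sigma^2 I_2. \qquad (22)$$

We will derive a much better covariance approximation using a more
precise expression for the volume.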

The volume form of the hyperbolic n-manifold is (see for example 2.3
in [1])

$$dV(v) = \frac{\sinh\|v\|}{\|v\|}\,dv = \frac{\exp(\|v\|) - \exp(-\|v\|)}{2\|v\|}\,dv$$

and consequently

$$dV(v) = [1 + \tfrac{1}{6}\|v\|^2 + \tfrac{1}{120}\|v\|^4 + O(\|v\|^6)]\,dv. \qquad (23)$$

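Similarly to the unit n-sphere case, we obtain

Lemma 4 The standard normal density on $\mathbb{H}^n$ has

$$k^{-1} = (2\pi)^{n/2}\sigma^n[1 + \tfrac{n}{6}\sigma^2 + \tfrac{n(n+2)}{120}\sigma^4 + O(\sigma^6)]$$

and

$$k^{-1}\Sigma = (2\pi)^{n/2}\sigma^{n+2}[1 + \tfrac{n+2}{6}\sigma^2 + \tfrac{n^2+3n+11}{120}\sigma^4 + O(\sigma^6)]\,I_n.$$

In particular, for n = 2,

$$\Sigma \approx \frac{1 + \tfrac{2}{3}\sigma^2 + \tfrac{7}{40}\sigma^4}{1 + \tfrac{1}{3}\sigma^2 + \tfrac{1}{15}\sigma^4}\,\sigma^2 I_2. \qquad (24)$$

Therefore we expect $tr(\Sigma)$ to overestimate $\sigma^2$. We confirm
this conclusion experimentally (see Figure 2). When $tr(T^{-1}) < 1$,
$\hat{\sigma}^2$ stays close to the predicted value (24). For larger
values of $\sigma$ a more precise approximation is needed.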
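
The direction of the bias on the hyperbolic plane can also be checked without sampling. The sketch below is an illustration under assumptions: it works in polar normal coordinates on $\mathbb{H}^2$, where the volume form is $(\sinh r / r)\, r\, dr\, d\theta$, so the dispersion reduces to a ratio of two radial integrals, and it takes half the trace of the covariance as the quantity compared with $\sigma^2$. The integrals are evaluated with a plain midpoint rule. The cutoff, grid size and function names are hypothetical.

```python
import math

def radial_moment(sigma, power, n=20000):
    """Midpoint-rule value of the integral over r in (0, 12 sigma) of
    r^power * exp(-r^2 / (2 sigma^2)) * sinh(r).

    The upper limit 12 * sigma is an assumed cutoff beyond which the
    integrand is negligible for moderate sigma.
    """
    r_max = 12.0 * sigma
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += r ** power * math.exp(-0.5 * (r / sigma) ** 2) * math.sinh(r)
    return total * h

def half_trace_covariance(sigma):
    """E[r^2] / 2 under the radial density exp(-r^2 / (2 sigma^2)) * sinh(r)."""
    return 0.5 * radial_moment(sigma, 2) / radial_moment(sigma, 0)

def first_order_prediction(sigma):
    """sigma^2 * (1 + 2 sigma^2 / 3) / (1 + sigma^2 / 3)."""
    s2 = sigma ** 2
    return s2 * (1.0 + 2.0 * s2 / 3.0) / (1.0 + s2 / 3.0)

sigma = 0.3
exact = half_trace_covariance(sigma)
prediction = first_order_prediction(sigma)
```

For small $\sigma$ the quadrature value sits just above $\sigma^2$ and close to the first-order prediction, which mirrors the overestimation expected under negative curvature.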

MATLAB programs for the experiments shown in Figures 1 and 2 are
available upon request.

Figure 2: Estimation of $\sigma^2$ for the normal distribution on
$\mathbb{H}^2$ with $T = \sigma^{-2} I_2$. Values of $\sigma$ are in
blue and decrease from 1 to 0.01 in the left figure and from $\sqrt{2}$
to $\sqrt{2}/100$ in the right one. Green curves correspond to the
prediction function given by equation (24). Red curves show the
estimates calculated using 200 samples for each $\sigma$.

6 Summary 

In this study we try to be more precise and general when defining
distributions on complete Riemannian manifolds, and on compact
manifolds in particular. We give a consistent definition that accounts
for the lack of a global parametrization on manifolds by being
coordinate independent. Also, coordinate-specific attributes, like the
concentration matrix and the covariance, are treated more carefully.
They are considered as tensors of the appropriate type. The motivating
idea behind this point of view is that only coordinate-invariant
objects should be used for statistical inference.

The families of centered distributions we dealt with are usually based
on a Euclidean multivariate kernel, like the normal one. That makes the
problem of relating the covariance of a manifold-valued variable to its
Euclidean counterpart interesting. We expressed formally one possible
relation in this regard and confirmed it with simulations. Our
experiments include the normal distribution on the unit 2-sphere, which
is of interest in directional statistics, and the normal distribution
on the hyperbolic plane, which lacks application potential for the
moment but is an interesting demonstration in itself, as it clearly
shows the impact of the negative curvature of the domain.